1,487 research outputs found

    Generic design of Chinese remaindering schemes

    We propose a generic design for Chinese remainder algorithms. A Chinese remainder computation consists in reconstructing an integer value from its residues modulo pairwise coprime integers. We also propose an efficient linear data structure, a radix ladder, for the intermediate storage and computations. Our design is structured into three main modules: a black-box residue computation in charge of computing each residue; a Chinese remaindering controller in charge of launching the computation and of the termination decision; and an integer builder in charge of the reconstruction computation. We then show that this design enables many different forms of Chinese remaindering (e.g. deterministic, early terminated, distributed), easy comparisons between these forms, and user-transparent parallelism at different parallel grains.
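As a rough illustration of the three-module split described in this abstract, the sketch below separates a black-box residue function, an incremental integer builder, and a controller that decides when to stop. All names (`IntegerBuilder`, `early_terminated_crt`, `stable_rounds`) are hypothetical, the builder is a plain incremental reconstruction rather than the paper's radix ladder, and the early-termination rule (stop once the reconstructed value has stayed unchanged for a few rounds) is one common heuristic, assumed here rather than taken from the paper.

```python
from math import gcd


class IntegerBuilder:
    """Incrementally reconstructs an integer from residues (hypothetical sketch)."""

    def __init__(self):
        self.value = 0      # current reconstruction, in [0, modulus)
        self.modulus = 1    # product of the moduli absorbed so far

    def add(self, residue, m):
        if gcd(self.modulus, m) != 1:
            raise ValueError("moduli must be pairwise coprime")
        # Find t with value + modulus * t == residue (mod m).
        inv = pow(self.modulus, -1, m)  # modular inverse, Python 3.8+
        t = ((residue - self.value) * inv) % m
        self.value += self.modulus * t
        self.modulus *= m

    def symmetric(self):
        """Representative in the symmetric range, so negative integers work."""
        v = self.value
        return v - self.modulus if v > self.modulus // 2 else v


def early_terminated_crt(black_box, moduli, stable_rounds=2):
    """Controller: feed residues to the builder until the value stabilises.

    black_box(m) must return the residue of the unknown integer modulo m.
    Stopping early is a heuristic; using every modulus gives the
    deterministic variant when their product exceeds twice the bound.
    """
    builder = IntegerBuilder()
    last, unchanged = None, 0
    for m in moduli:
        builder.add(black_box(m), m)
        current = builder.symmetric()
        unchanged = unchanged + 1 if current == last else 0
        last = current
        if unchanged >= stable_rounds:
            break
    return builder.symmetric()
```

Because the controller only sees `black_box` and the builder, swapping the termination rule or distributing the residue calls does not touch the reconstruction code, which is the kind of separation the abstract argues for.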

    The SIGNAL Approach to the Design of System Architectures

    Modeling plays a central role in system engineering. It significantly reduces design cost and effort by giving developers a cheaper and more relevant means of experimentation, so design choices can be assessed earlier. Using a formalism with solid mathematical foundations, such as the synchronous language SIGNAL, makes validation possible. This is the aim of the methodology defined for the design of embedded systems, where emphasis is put on formal techniques for verification, analysis, and code generation. This paper mainly focuses on the modeling of architecture components using SIGNAL. For illustration, we consider the modeling of a bounded FIFO queue, which is intended to be used for communication protocols. We bring out the capabilities of SIGNAL to express specifications in an elegant way, and we check a few elementary properties on the resulting model for correctness.
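For readers unfamiliar with the running example, a bounded FIFO queue and the kind of elementary properties one might check on it can be sketched as follows. This is ordinary Python, not SIGNAL, and it is not the paper's model: the class name `BoundedFifo` and the policy of rejecting a push when the queue is full are assumptions made for illustration.

```python
from collections import deque


class BoundedFifo:
    """Fixed-capacity first-in, first-out queue (hypothetical sketch)."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items = deque()

    def push(self, item):
        """Return True if the item was stored, False if the queue was full."""
        if len(self._items) >= self.capacity:
            return False  # assumed policy: reject, never overwrite
        self._items.append(item)
        return True

    def pop(self):
        """Return (True, item), or (False, None) if the queue was empty."""
        if not self._items:
            return (False, None)
        return (True, self._items.popleft())

    def __len__(self):
        return len(self._items)
```

Elementary properties in the spirit of the abstract would be that the length never exceeds the capacity, that items leave in the order they arrived, and that popping an empty queue yields no item.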

    Synchronous modeling of avionics applications using the SIGNAL language

    In this paper, we discuss a synchronous, component-based approach to the modeling of avionics applications. The specification of the components relies on the avionics standard ARINC 653, and the synchronous language SIGNAL is used as the modeling formalism. The POLYCHRONY tool-set allows for a seamless design process based on the SIGNAL model, which provides possibilities of high-level specifications, verification and analysis of the specifications at very early stages of the design, and finally automatic code generation through formal transformations of these specifications. This suits the basic stringent requirements that should be met by any design environment for embedded applications in general, and avionics applications in particular.

    Toward Static Analysis of SIGNAL Programs using Interval Techniques

    This paper presents work in progress aimed at improving the functional analysis of Signal programs. The technique usually adopted relies on abstractions. Typically, in order to check the presence or absence of variables in a program at some logical instants, the program is transformed into another program that reflects its clock information, so that the presence or absence of each variable can be checked straightforwardly. Signal adopts a boolean abstraction for the static functional analysis of programs. This abstraction does not allow full reasoning about the values of non-logical variables. Here, we propose a solution based on interval techniques in order to deal with both the logical and the numerical parts of programs.
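To make the limitation concrete: a purely boolean abstraction treats a numerical comparison such as `x + y > 10` as an opaque condition, whereas an interval abstraction can sometimes decide it from the ranges of `x` and `y`. The sketch below is generic interval arithmetic in Python, not the analysis proposed in the paper; the `Interval` class and its `greater_than` method are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Closed interval [lo, hi] over-approximating a numeric variable."""
    lo: float
    hi: float

    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __mul__(self, other):
        # The extreme products always occur at the endpoints.
        p = [self.lo * other.lo, self.lo * other.hi,
             self.hi * other.lo, self.hi * other.hi]
        return Interval(min(p), max(p))

    def greater_than(self, c):
        """Three-valued test of `value > c`: True, False, or None (unknown)."""
        if self.lo > c:
            return True
        if self.hi <= c:
            return False
        return None


x = Interval(0, 5)
y = Interval(1, 2)
# x + y lies in [1, 7], so the condition `x + y > 10` can never hold.
condition = (x + y).greater_than(10)
```

Under these assumed ranges, a signal sampled on that condition would never be present, a fact that a boolean abstraction alone cannot establish.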

    Polychronous mode automata

    Among related synchronous programming principles, the model of computation of the Polychrony workbench stands out by its capability to give high-level descriptions of systems where each component owns a local activation clock (typically, distributed real-time systems or systems on a chip). In order to bring the modeling capability of Polychrony to the context of a model-driven engineering toolset for embedded system design, we define a diagrammatic notation composed of mode automata and data-flow equations on top of the multi-clocked synchronous model of computation supported by the Polychrony workbench. We demonstrate the agility of this paradigm by considering the example of an integrated modular avionics application. Our presentation features the formalization and use of model transformation techniques of the GME environment to embed the extension of Polychrony's meta-model with mode automata.

    Preliminary Experiments with XKaapi on Intel Xeon Phi Coprocessor

    This paper presents preliminary performance comparisons of parallel applications developed natively for the Intel Xeon Phi accelerator using three different parallel programming environments and their associated runtime systems. We compare Intel OpenMP, Intel CilkPlus and XKaapi on the same benchmark suite, and we provide comparisons between an Intel Xeon Phi coprocessor and a Sandy Bridge Xeon-based machine. Our benchmark suite is composed of three computing kernels: a Fibonacci computation that allows studying the overhead and scalability of the runtime system, an NQueens application generating irregular and dynamic tasks, and a Cholesky factorization algorithm. We also compare the Cholesky factorization with the parallel algorithm provided by the Intel MKL library for Intel Xeon Phi. Performance evaluation shows that our XKaapi data-flow parallel programming environment exposes the lowest overhead of all and is highly competitive with native OpenMP and CilkPlus environments on Xeon Phi. Moreover, the efficient handling of data-flow dependencies between tasks lets our XKaapi environment expose more parallelism for some applications, such as the Cholesky factorization. In that case, we observe substantial gains with up to 180 hardware threads over the state-of-the-art MKL, with a 47% performance increase for 60 hardware threads.

    VtkSMP: Task-based Parallel Operators for VTK Filters

    NUMA nodes are potentially powerful, but taking advantage of their capabilities is challenging due to their architecture (multiple computing cores, advanced memory hierarchy). They are nonetheless one of the key components for processing the ever-growing amount of data produced by scientific simulations. In this paper we study the parallelization of patterns commonly used in VTK algorithms and propose a new multi-threaded plugin for VTK that eases the development of parallel multi-core VTK filters. We specifically focus on task-based approaches and show that with a limited code refactoring effort we can take advantage of NUMA node capabilities. We evaluate our patterns on a transform filter, a base isosurface extraction filter and a min/max tree accelerated isosurface extraction. We support three programming environments, OpenMP, Intel TBB and X-KAAPI, and propose different algorithmic refinements according to the capabilities of the target environment. Results show that we can speed up execution by up to 30 times on a 48-core machine.

    Flexible Rollback Recovery in Dynamic Heterogeneous Grid Computing

    Large applications executing on Grid or cluster architectures consisting of hundreds or thousands of computational nodes create problems with respect to reliability. The sources of the problems are node failures and the need for dynamic configuration over extensive runtime. This paper presents two fault-tolerance mechanisms called Theft-Induced Checkpointing and Systematic Event Logging. These are transparent protocols capable of overcoming problems associated with both benign faults, i.e., crash faults, and node or subnet volatility. Specifically, the protocols base the state of the execution on a dataflow graph, allowing for efficient recovery in dynamic heterogeneous systems as well as multithreaded applications. By allowing recovery even under different numbers of processors, the approaches are especially suitable for applications with a need for adaptive or reactionary configuration control. The low-cost protocols offer the capability of controlling or bounding the overhead. A formal cost model is presented, followed by an experimental evaluation. It is shown that the overhead of the protocol is very small, and the maximum work lost by a crashed process is small and bounded. Index Terms: Grid computing, rollback recovery, checkpointing, event logging.

    Optimized Coordinated Checkpoint/Rollback Protocol using a Dataflow Graph Model

    Fault-tolerance protocols play an important role in today's long-running scientific parallel applications. The probability of a failure may be significant due to the number of unreliable components involved in an execution. We present our approach and preliminary results on a new checkpoint/rollback protocol based on a coordinated scheme. The application is described using a dataflow graph, which is an abstract representation of the execution. Thanks to this representation, fault recovery in our protocol requires only a partial restart of the other processes. Simulations on a domain decomposition application show that the amount of computation required to restart and the number of processes involved are reduced compared to the classical global rollback protocol.